Large Scale Semi - supervised Linear SVM with Stochastic Gradient Descent ⋆

نویسندگان

  • Xin ZHOU
  • Conghui ZHU
  • Sheng LI
  • Mo YU
چکیده

Semi-supervised learning tries to employ a large collection of unlabeled data and a few labeled examples for improving generalization performance, which has been proved meaningful in real-world applications. The bottleneck of exiting semi-supervised approaches lies in over long training time due to the large scale unlabeled data. In this article we introduce a novel method for semi-supervised linear support vector machine based on average stochastic gradient descent, which significantly enhances the training speed of S3VM over existing toolkits, such as SVMlight-TSVM, CCCP-TSVM and SVMlin. We evaluate our method on text categorization and sentiment classification respectively, which indicates its efficiency on large scale semi-supervised tasks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Semi-Supervised Convex Training for Dependency Parsing

We present a novel semi-supervised training algorithm for learning dependency parsers. By combining a supervised large margin loss with an unsupervised least squares loss, a discriminative, convex, semi-supervised learning algorithm can be obtained that is applicable to large-scale problems. To demonstrate the benefits of this approach, we apply the technique to learning dependency parsers from...

متن کامل

Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization

Stochastic Gradient Descent (SGD) has become popular for solving large scale supervised machine learning optimization problems such as SVM, due to their strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has been implemented in various software packages, it has so far lacked good convergence analysis. This paper presents a new analysis of Stochastic Dua...

متن کامل

Breaking the curse of kernelization: budgeted stochastic gradient descent for large-scale SVM training

Online algorithms that process one example at a time are advantageous when dealing with very large data or with data streams. Stochastic Gradient Descent (SGD) is such an algorithm and it is an attractive choice for online Support Vector Machine (SVM) training due to its simplicity and effectiveness. When equipped with kernel functions, similarly to other SVM learning algorithms, SGD is suscept...

متن کامل

Projected Semi-Stochastic Gradient Descent Method with Mini-Batch Scheme under Weak Strong Convexity Assumption

We propose a projected semi-stochastic gradient descent method with mini-batch for improving both the theoretical complexity and practical performance of the general stochastic gradient descent method (SGD). We are able to prove linear convergence under weak strong convexity assumption. This requires no strong convexity assumption for minimizing the sum of smooth convex functions subject to a c...

متن کامل

Scalable Support Vector Machine for Semi-supervised Learning

Owing to the prevalence of unlabeled data, semisupervised learning has recently drawn significant attention and has found applicable in many real-world applications. In this paper, we present the so-called Graph-based Semi-supervised Support Vector Machine (gS3VM), a method that leverages the excellent generalization ability of kernel-based method with the geometrical and distributive informati...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013